
fix: Update Search Index after Tag update #530

Open
jesperhodge wants to merge 9 commits into openedx:main from jesperhodge:jhodge/tag-change-updates-meilisearch


@jesperhodge jesperhodge commented Apr 2, 2026

Description

This implements a fix for openedx/modular-learning#258.

The issue: after you rename a tag on a taxonomy from the taxonomy frontend in Course Authoring, the Meilisearch search index for the associated tagged objects does not get updated. As a result, the Libraries -> Tags search filter keeps showing the old tag name.
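As a rough illustration of the bug, here is a toy model (not the actual openedx-platform or Meilisearch code). It assumes the index stores tag names copied onto each tagged object's document. Every name below (`tags`, `object_tags`, `search_index`, `upsert_tags_doc`, `rename_tag`, the `lb:block-1` key) is hypothetical.

```python
# Toy model only: plain dicts stand in for the database and the search index.
tags = {7: "Algebra"}                  # hypothetical tag table: tag id -> tag name
object_tags = {"lb:block-1": [7]}      # hypothetical links: object id -> tag ids
search_index = {}                      # hypothetical index: object id -> document


def upsert_tags_doc(object_id):
    """Rebuild the tags part of one object's index document from current tag names."""
    names = [tags[tag_id] for tag_id in object_tags[object_id]]
    search_index[object_id] = {"tags": names}


def rename_tag(tag_id, new_name, reindex=False):
    """Rename a tag; only touch the index when reindex is True."""
    tags[tag_id] = new_name
    if reindex:
        for object_id, tag_ids in object_tags.items():
            if tag_id in tag_ids:
                upsert_tags_doc(object_id)
```

Without `reindex=True`, the document for `lb:block-1` keeps the old name after a rename, which is the stale-filter symptom described above.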

Use of AI

I used AI to implement some of the code, especially the introduction of the Celery task. I used it to help with tests as well, but carefully reviewed everything, and followed a red-green TDD process with the tests to ensure correctness.

I only use AI for support in targeted spots rather than have an agent just write code for me.

I have tested manually as well.

Testing Instructions

  • Go to Course Authoring
  • Go to the libraries tab
  • Go to the detail view of one library
  • Choose a library block
  • Click on "manage tags"
  • Apply a tag to the block that you want to test renaming on -> click save
  • At the top, under "All content", you find a search bar. Right next to the search input field, there is a filter dropdown called "Tags". Open it
  • When you expand the taxonomy of the tag you added, you should find that tag there now - with its current name
  • In a separate tab, go to Course Authoring -> Taxonomies
  • Open the detail view for the taxonomy whose tag you applied
  • Find the tag in the table and click "Rename"
  • Rename the tag to something else, and click save
  • Go back to the tab where you have the library with the tag applied
  • Reload the page
  • In the "Tags" filter dropdown, now you should see the renamed tag under its new name, not its old name

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 2, 2026

openedx-webhooks commented Apr 2, 2026

Thanks for the pull request, @jesperhodge!

This repository is currently maintained by @axim-engineering.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@jesperhodge
Author

Technical Findings

Here are some helpful insights from our slack conversation as to how this can be implemented.

I'm just going to post the conversation here.

"What repo does meilisearch originate from?
19 replies
Braden MacDonald  [5:44 PM]
Can you clarify the question? I don't think you're asking for https://github.com/meilisearch/meilisearch right?
Tyler Bain  [5:46 PM]
Probably not - I’m looking into an issue where we need to update the search index after a tag update; openedx/modular-learning#258, I was curious what repo we have that either has the code for it, or the management for our instance of meilisearch is saved?
Braden MacDonald  [5:47 PM]
OK, we use Meilisearch for a few different things in a few different repos, but what you're looking for is probably https://github.com/openedx/openedx-platform/blob/master/openedx/core/djangoapps/content/search/handlers.py
Tyler Bain  [5:47 PM]
Ah! Nice, thank you! 🙂
Braden MacDonald  [5:48 PM]
The CONTENT_OBJECT_ASSOCIATIONS_CHANGED event seems specifically related to tags
Tyler Bain  [5:49 PM]
Excellent, thanks for shortening that research, I appreciate it
Jesper Hodge  [5:49 PM]
For curiosity, is that an event that should be triggered via the event bus?
[5:50 PM] Or just django internal inside CMS
Braden MacDonald  [5:59 PM]
Mmm, I am not sure tbh. But since the events are defined in openedx-events, I think that may mean they use the event bus?
For your purposes the details of the transport layer hopefully don't matter too much though?
Jesper Hodge  [6:09 PM]
Apparently not, all the better.
I see you can use this like here https://github.com/openedx/openedx-platform/blob/e3ab6345b633094f322cf208fd8d6c074[…]0a/openedx/core/djangoapps/content_libraries/signal_handlers.py
Jesper Hodge  [1:33 PM]
Hi @Braden, CONTENT_OBJECT_ASSOCIATIONS_CHANGED is a signal for a single object. A tag that's renamed could be applied to very many objects.
The content.search.api method used for this update is upsert_content_object_tags_index_doc:
https://github.com/openedx/openedx-platform/blob/76462f1e5fa9b37d2621ad7ad19514b403908970/openedx/core/djangoapps/content/search/api.py#L908.
In order to not send thousands of search index updates, I assume I will need to write a new method for this api that batch updates docs -> searchable_doc_tags here?
Does that approach sound correct to you?
Braden MacDonald  [1:39 PM]
I'm not sure it's worth the effort. Tags are renamed rarely and as long as the updates are processed asynchronously by a worker, it may be totally fine to have thousands of events and updates?
Jesper Hodge  [1:42 PM]
I wouldn't know, because I don't know how meilisearch handles this, if it's too much load or if it's trivial. Do you know @Dave Ormsbee ?
Braden MacDonald  [1:46 PM]
It would be most efficient as a global update using a RHAI function to just rename all instances of a tag directly, but single-item index updates are still very fast. The delay is usually going to be more on the Open edX side processing and building the updated index values, not on the Meilisearch side of updating the index.
[1:47 PM] And in this case, since we're strictly processing the tags and not re-generating the entire document for each changed entity, the Open edX side should also be relatively fast.
Jesper Hodge  [1:47 PM]
Okay sounds fine then
Braden MacDonald  [1:47 PM]
You could mention the RHAI function approach in a comment, but for now I think we want to avoid tying ourselves too much more to Meilisearch-specific features on the backend.
Jesper Hodge  [1:47 PM]
Thanks that helps
Braden MacDonald  [2:55 PM]
Here is the guidance on emitting many events from the openedx-events repo: "we recommend that simple handlers are synchronous and the sender of the event should send the event(s) from an async celery task if it is expected to result in a lot of handlers being called."
"
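To make the approach agreed on in the thread concrete, here is a minimal sketch of the fan-out: one call per renamed tag that emits one event per tagged object. It is not the code in this PR. `emit_changes_for_tag`, `object_ids_for_tag`, and `send_event` are hypothetical names. In the real change the body would presumably run inside a Celery task and `send_event` would wrap the CONTENT_OBJECT_ASSOCIATIONS_CHANGED event; both are replaced by plain callables here so the sketch runs on its own.

```python
from typing import Callable, Iterable


def emit_changes_for_tag(
    tag_id: int,
    object_ids_for_tag: Callable[[int], Iterable[str]],  # hypothetical lookup of tagged objects
    send_event: Callable[[str], None],                   # hypothetical per-object event emitter
) -> int:
    """Emit one 'associations changed' event per distinct object linked to tag_id.

    Returns the number of events sent.
    """
    seen = set()
    for object_id in object_ids_for_tag(tag_id):
        if object_id in seen:
            # An object may be linked more than once; one event per object is enough.
            continue
        seen.add(object_id)
        send_event(object_id)
    return len(seen)
```

Because each handler only refreshes the tags portion of one document, the cost grows linearly with the number of tagged objects, which the thread considers acceptable as long as the loop runs on a worker rather than in the request.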

@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Apr 6, 2026
@jesperhodge jesperhodge changed the title from "feat: add signal handler" to "fix: Update Search Index after Tag update" Apr 6, 2026
Reset the requirements to their state before I ran make upgrade, thus fixing version upgrade problems; and then run make compile-requirements to install openedx-events without upgrading other packages
@jesperhodge jesperhodge marked this pull request as ready for review April 7, 2026 21:39
@jesperhodge
Author

Note to reviewers (e.g. @bradenmacdonald ): I'll implement this as an async celery task now and add a few small performance improvements.

@jesperhodge
Author

Manual testing confirms that after the refactor to a celery task, the signal is still sent and received.
Screenshot 2026-04-08 at 11 54 32 AM

Contributor

@ormsbee ormsbee left a comment

A couple of minor requests. Also, please bump the version for the repo and rebase + squash your commits. Thank you!


@shared_task
def emit_content_object_associations_changed_for_tag_task(tag_id: int) -> int:
    """Emit content association changed events for all objects linked to the given tag id."""
Contributor

Please use the docstring to expand a little on what's going on here, and how it fits into the bigger picture. It might not be intuitive to folks that we're doing this for the benefit of search indexing that is happening at a higher level in openedx-platform.

return

transaction.on_commit(
    lambda: emit_content_object_associations_changed_for_tag_task.delay(tag_id)
Contributor

This code looks correct, but as a general practice, please use partials instead of lambdas for this kind of thing: https://adamj.eu/tech/2022/08/22/use-partial-with-djangos-transaction-on-commit/
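A small, Django-free sketch of why `functools.partial` is the safer habit: a lambda looks up its variables when it is called, while a partial binds the argument when it is created. `on_commit`, `run_commit_hooks`, and `enqueue` below are hypothetical stand-ins for `transaction.on_commit`, the commit itself, and the task's `.delay`. The lambda in this PR is not created in a loop, so it is unaffected, which matches the "looks correct" remark above.

```python
from functools import partial

# Hypothetical stand-in for Django's transaction.on_commit: it only queues callbacks.
_callbacks = []


def on_commit(func):
    _callbacks.append(func)


def run_commit_hooks():
    """Simulate the commit: run every queued callback and collect the results."""
    results = [callback() for callback in _callbacks]
    _callbacks.clear()
    return results


def enqueue(tag_id):
    """Hypothetical stand-in for some_task.delay(tag_id); returns the id it received."""
    return tag_id


def schedule_with_lambda(tag_ids):
    for tag_id in tag_ids:
        # The lambda reads tag_id when it runs, i.e. after the loop has finished.
        on_commit(lambda: enqueue(tag_id))
    return run_commit_hooks()


def schedule_with_partial(tag_ids):
    for tag_id in tag_ids:
        # partial captures the current value of tag_id immediately.
        on_commit(partial(enqueue, tag_id))
    return run_commit_hooks()
```

With `[1, 2, 3]` as input, the lambda version enqueues the last id three times, while the partial version enqueues each id once.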


Labels

open-source-contribution PR author is not from Axim or 2U

Projects

Status: Waiting on Author

4 participants